Results 1 - 19 of 19
1.
Curr Biol ; 34(5): 1098-1106.e5, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38218184

ABSTRACT

Visual shape perception is central to many everyday tasks, from object recognition to grasping and handling tools [1-10]. Yet how shape is encoded in the visual system remains poorly understood. Here, we probed shape representations using visual aftereffects: perceptual distortions that occur following extended exposure to a stimulus [11-17]. Such effects are thought to be caused by adaptation in neural populations that encode both simple, low-level stimulus characteristics [17-20] and more abstract, high-level object features [21-23]. To tease these two contributions apart, we used machine-learning methods to synthesize novel shapes in a multidimensional shape space, derived from a large database of natural shapes [24]. Stimuli were carefully selected such that low-level and high-level adaptation models made distinct predictions about the shapes that observers would perceive following adaptation. We found that adaptation along vector trajectories in the high-level shape space predicted shape aftereffects better than simple low-level processes. Our findings reveal the central role of high-level statistical features in the visual representation of shape. The findings also hint that human vision is attuned to the distribution of shapes experienced in the natural environment.
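As a toy illustration of the high-level account described in this abstract (not the authors' actual model), adaptation can be sketched as a repulsion of the test shape away from the adapter along a vector trajectory in a multidimensional shape space; the function name and the fraction `k` are hypothetical:

```python
import numpy as np

def highlevel_aftereffect(test, adapter, k=0.3):
    """Toy high-level adaptation model: after adapting to `adapter`,
    the perceived position of `test` in shape space is repelled from
    the adapter along the adapter->test direction by a fraction k."""
    test, adapter = np.asarray(test, float), np.asarray(adapter, float)
    return test + k * (test - adapter)

adapter = np.array([1.0, 0.0])   # adapting shape's shape-space coordinates
test = np.array([0.5, 0.5])      # test shape's shape-space coordinates
perceived = highlevel_aftereffect(test, adapter)
```

The perceived shape ends up farther from the adapter than the physical test shape, which is the qualitative signature of a shape aftereffect in this kind of space.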


Subjects
Vision, Ocular; Visual Perception; Humans; Perceptual Distortion; Environment; Pattern Recognition, Visual; Photic Stimulation
2.
Mem Cognit ; 2023 Sep 05.
Article in English | MEDLINE | ID: mdl-37668880

ABSTRACT

Many objects and materials in our environment are subject to transformations that alter their shape. For example, branches bend in the wind, ice melts, and paper crumples. Still, we recognize objects and materials across these changes, suggesting we can distinguish an object's original features from those caused by the transformations ("shape scission"). Yet, if we truly understand transformations, we should not only be able to identify their signatures but also actively apply the transformations to new objects (i.e., through imagination or mental simulation). Here, we investigated this ability using a drawing task. On a tablet computer, participants viewed a sample contour and its transformed version, and were asked to apply the same transformation to a test contour by drawing what the transformed test shape should look like. Thus, they had to (i) infer the transformation from the shape differences, (ii) envisage its application to the test shape, and (iii) draw the result. Our findings show that drawings were more similar to the ground-truth transformed test shape than to the original test shape, demonstrating the inference and reproduction of transformations from observation. However, this was only observed for relatively simple shapes. The ability was also modulated by transformation type and magnitude, but not by the similarity between sample and test shapes. Together, our findings suggest that we can distinguish between representations of original object shapes and their transformations, and can use visual imagery to mentally apply nonrigid transformations to observed objects, showing how we not only perceive but also 'understand' shape.

3.
Neural Netw ; 164: 228-244, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37156217

ABSTRACT

The contrast sensitivity function (CSF) is a fundamental signature of the visual system that has been measured extensively in several species. It is defined by the visibility threshold for sinusoidal gratings at all spatial frequencies. Here, we investigated the CSF in deep neural networks using the same 2AFC contrast detection paradigm as in human psychophysics. We examined 240 networks pretrained on several tasks. To obtain their corresponding CSFs, we trained a linear classifier on top of the features extracted from the frozen pretrained networks. The linear classifier is trained exclusively on a contrast discrimination task with natural images: it has to find which of two input images has higher contrast. The network's CSF is then measured by detecting which of two images contains a sinusoidal grating of varying orientation and spatial frequency. Our results demonstrate that characteristics of the human CSF are manifested in deep networks both in the luminance channel (a band-limited inverted U-shaped function) and in the chromatic channels (two low-pass functions with similar properties). The exact shape of the networks' CSF appears to be task-dependent. The human CSF is better captured by networks trained on low-level visual tasks such as image denoising or autoencoding. However, human-like CSFs also emerge in mid- and high-level tasks such as edge detection and object recognition. Our analysis shows that a human-like CSF appears in all architectures but at different depths of processing: in some at early layers, in others at intermediate and final layers. Overall, these results suggest that (i) deep networks model the human CSF faithfully, making them suitable candidates for applications in image quality and compression; (ii) efficient, purposeful processing of the natural world drives the CSF shape; and (iii) visual representations from all levels of the visual hierarchy contribute to the tuning curve of the CSF, in turn implying that a function we intuitively think of as modulated by low-level visual features may arise as a consequence of pooling from a larger set of neurons at all levels of the visual system.
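The band-limited luminance CSF this abstract describes can be illustrated with a minimal sketch: a single bandpass (difference-of-Gaussians) "frozen feature" responds most strongly to gratings of intermediate spatial frequency. This is an assumption-laden stand-in for the paper's linear probe over pretrained network features, not the authors' code; the filter bandwidths are arbitrary:

```python
import numpy as np

def grating(freq, contrast, size=64):
    """Horizontal sinusoidal grating; freq in cycles per image."""
    x = np.arange(size)
    return contrast * np.sin(2 * np.pi * freq * x / size)[None, :] * np.ones((size, 1))

def dog_response(img):
    """Stand-in 'frozen feature': mean energy after a difference-of-Gaussians
    (bandpass) filter applied along rows in the frequency domain."""
    size = img.shape[1]
    f = np.fft.fftfreq(size) * size               # frequencies in cycles/image
    sigma_c, sigma_s = 8.0, 2.0                   # center/surround bandwidths (arbitrary)
    dog = np.exp(-(f / sigma_c) ** 2) - np.exp(-(f / sigma_s) ** 2)
    spec = np.fft.fft(img, axis=1) * dog[None, :]
    return np.mean(np.abs(spec) ** 2)

# Sensitivity proxy: filter response at fixed contrast, across spatial frequency.
sens = {f: dog_response(grating(f, contrast=0.5)) for f in [1, 4, 16]}
```

The response peaks at the intermediate frequency and falls off at both low and high frequencies, i.e., the inverted-U, band-limited shape the abstract attributes to the luminance channel.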


Subjects
Contrast Sensitivity; Visual Perception; Humans; Visual Perception/physiology; Neurons/physiology; Neural Networks, Computer; Psychophysics; Pattern Recognition, Visual/physiology
4.
Curr Biol ; 32(21): R1224-R1225, 2022 11 07.
Article in English | MEDLINE | ID: mdl-36347228

ABSTRACT

The discovery of mental rotation was one of the most significant landmarks in experimental psychology, leading to the ongoing assumption that to visually compare objects from different three-dimensional viewpoints, we use explicit internal simulations of object rotations, to 'mentally adjust' one object until it matches the other [1]. These rotations are thought to be performed on three-dimensional representations of the object, by literal analogy to physical rotations. In particular, it is thought that an imagined object is continuously adjusted at a constant three-dimensional angular rotation rate from its initial orientation to the final orientation through all intervening viewpoints [2]. While qualitative theories have tried to account for this phenomenon [3], to date there has been no explicit, image-computable model of the underlying processes. As a result, there is no quantitative account of why some object viewpoints appear more similar to one another than others when the three-dimensional angular difference between them is the same [4,5]. We reasoned that the specific pattern of non-uniformities in the perception of viewpoints can reveal the visual computations underlying mental rotation. We therefore compared human viewpoint perception with a model based on the kind of two-dimensional 'optical flow' computations that are thought to underlie motion perception in biological vision [6], finding that the model reproduces the specific errors that participants make. This suggests that mental rotation involves simulating the two-dimensional retinal image change that would occur when rotating objects. When we compare objects, we do not do so in a distal three-dimensional representation as previously assumed, but by measuring how much the proximal stimulus would change if we watched the object rotate, capturing perspectival appearance changes [7].


Subjects
Motion Perception; Optic Flow; Humans; Pattern Recognition, Visual; Visual Perception
5.
Elife ; 11, 2022 05 10.
Article in English | MEDLINE | ID: mdl-35536739

ABSTRACT

Humans have the amazing ability to learn new visual concepts from just a single exemplar. How we achieve this remains mysterious. State-of-the-art theories suggest observers rely on internal 'generative models', which not only describe observed objects, but can also synthesize novel variations. However, compelling evidence for generative models in human one-shot learning remains sparse. In most studies, participants merely compare candidate objects created by the experimenters, rather than generating their own ideas. Here, we overcame this key limitation by presenting participants with 2D 'Exemplar' shapes and asking them to draw their own 'Variations' belonging to the same class. The drawings reveal that participants inferred, and synthesized, genuine novel categories that were far more varied than mere copies. Yet, there was striking agreement between participants about which shape features were most distinctive, and these tended to be preserved in the drawn Variations. Indeed, swapping distinctive parts caused objects to swap apparent category. Our findings suggest that internal generative models are key to how humans generalize from single exemplars. When observers see a novel object for the first time, they identify its most distinctive features and infer a generative model of its shape, allowing them to mentally synthesize plausible variants.


Subjects
Generalization, Psychological; Learning; Humans; Pattern Recognition, Visual
6.
PLoS Comput Biol ; 17(6): e1008981, 2021 06.
Article in English | MEDLINE | ID: mdl-34061825

ABSTRACT

Shape is a defining feature of objects, and human observers can effortlessly compare shapes to determine how similar they are. Yet, to date, no image-computable model can predict how visually similar or different shapes appear. Such a model would be an invaluable tool for neuroscientists and could provide insights into computations underlying human shape perception. To address this need, we developed a model ('ShapeComp'), based on over 100 shape features (e.g., area, compactness, Fourier descriptors). When trained to capture the variance in a database of >25,000 animal silhouettes, ShapeComp accurately predicts human shape similarity judgments between pairs of shapes without fitting any parameters to human data. To test the model, we created carefully selected arrays of complex novel shapes using a Generative Adversarial Network trained on the animal silhouettes, which we presented to observers in a wide range of tasks. Our findings show that incorporating multiple ShapeComp dimensions facilitates the prediction of human shape similarity across a small number of shapes, and also captures much of the variance in the multiple arrangements of many shapes. ShapeComp outperforms both conventional pixel-based metrics and state-of-the-art convolutional neural networks, and can also be used to generate perceptually uniform stimulus sets, making it a powerful tool for investigating shape and object representations in the human brain.
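A few of the classic contour descriptors that feature banks like ShapeComp build on (area, perimeter, compactness) can be computed directly from a closed contour. This sketch is illustrative only, not the published ShapeComp implementation:

```python
import numpy as np

def shape_features(contour):
    """A few classic shape descriptors for a closed 2D contour:
    area (shoelace formula), perimeter (sum of edge lengths, wrapping
    around), and compactness (4*pi*A / P^2, which equals 1 for a circle
    and decreases for elongated or irregular shapes)."""
    x, y = contour[:, 0], contour[:, 1]
    area = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    edges = np.diff(contour, axis=0, append=contour[:1])
    perimeter = np.sum(np.linalg.norm(edges, axis=1))
    compactness = 4 * np.pi * area / perimeter ** 2
    return {"area": area, "perimeter": perimeter, "compactness": compactness}

# A regular polygon approximating a circle should have compactness near 1.
t = np.linspace(0, 2 * np.pi, 100, endpoint=False)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
feats = shape_features(circle)
```

Models like ShapeComp stack 100+ such descriptors into a vector per shape, so that perceptual similarity can be read off as distance in the resulting feature space.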


Subjects
Computational Biology/methods; Pattern Recognition, Visual; Animals; Humans; Photic Stimulation
7.
Sci Rep ; 10(1): 22141, 2020 12 17.
Article in English | MEDLINE | ID: mdl-33335146

ABSTRACT

Establishing correspondence between objects is fundamental for object constancy, similarity perception and identifying transformations. Previous studies measured point-to-point correspondence between objects before and after rigid and non-rigid shape transformations. However, we can also identify 'similar parts' on extremely different objects, such as butterflies and owls or lizards and whales. We measured point-to-point correspondence between such object pairs. In each trial, a dot was placed on the contour of one object, and participants had to place a dot on 'the corresponding location' of the other object. Responses show correspondence is established based on similarities between semantic parts (such as head, wings, or legs). We then measured correspondence between ambiguous objects with different labels (e.g., between 'duck' and 'rabbit' interpretations of the classic ambiguous figure). Despite identical geometries, correspondences were different across the interpretations, based on semantics (e.g., matching 'Head' to 'Head', 'Tail' to 'Tail'). We present a zero-parameter model based on labeled semantic part data (obtained from a different group of participants) that explains our data well and outperforms an alternative model based on contour curvature. This demonstrates how we establish correspondence between very different objects by evaluating the similarity between semantic parts, combining perceptual organization and cognitive processes.

8.
Data Brief ; 29: 105302, 2020 Apr.
Article in English | MEDLINE | ID: mdl-32140517

ABSTRACT

With the advent of deep convolutional neural networks, machines now rival humans in terms of object categorization. The neural networks solve categorization with a hierarchical organization that shares a striking resemblance to their biological counterpart, leading to their status as a standard model of object recognition in biological vision. Despite training on thousands of images of object categories, however, machine-learning networks are poorer generalizers, often fooled by adversarial images produced with very simple image manipulations that humans easily recognize as false. Humans, on the other hand, can generalize object classes from very few samples. Here we provide a dataset of novel object classifications in humans. We gathered thousands of crowd-sourced human responses to novel objects embedded with either 1 or 16 context samples. Human decisions and stimuli together have the potential to be re-used (1) as a tool to better understand the nature of the gap in category learning from few samples between humans and machines, and (2) as a benchmark of generalization across machine-learning networks.

9.
Vision Res ; 165: 98-108, 2019 12.
Article in English | MEDLINE | ID: mdl-31707254

ABSTRACT

One aspect of human vision unmatched by machines is the capacity to generalize from few samples. Observers tend to know when novel objects are in the same class despite large differences in shape, material, or viewpoint. A major challenge in studying such generalization is that participants can see each novel sample only once. To overcome this, we used crowdsourcing to obtain responses from 500 human observers on 20 novel object classes, with each stimulus compared to 1 or 16 related objects. The results reveal that humans generalize from sparse data in ways that vary systematically with the number and variance of the samples. We compared human responses to 'ShapeComp', an image-computable model based on >100 shape descriptors, and 'AlexNet', a convolutional neural network that roughly matches humans at recognizing 1000 categories of real-world objects. With 16 samples, the models were consistent with human responses without free parameters. Thus, when there are a sufficient number of samples, observers rely on shallow but efficient processes based on a fixed set of features. With 1 sample, however, the models required different feature weights for each object. This suggests that one-shot categorization involves more sophisticated processes that actively identify the unique characteristics underlying each object class.


Subjects
Attention/physiology; Form Perception/physiology; Pattern Recognition, Visual/physiology; Recognition, Psychology/physiology; Humans; Photic Stimulation/methods
10.
Sci Rep ; 9(1): 6263, 2019 04 18.
Article in English | MEDLINE | ID: mdl-31000759

ABSTRACT

In the field of spatial coding it is well established that we mentally represent objects for action not only relative to ourselves, egocentrically, but also relative to other objects (landmarks), allocentrically. Several factors facilitate allocentric coding, for example, when objects are task-relevant or constitute stable and reliable spatial configurations. What is unknown, however, is how object-semantics facilitate the formation of these spatial configurations and thus allocentric coding. Here we demonstrate that (i) we can quantify the semantic similarity of objects and that (ii) semantically similar objects can serve as a cluster of landmarks that are allocentrically coded. Participants arranged a set of objects based on their semantic similarity. These arrangements were then entered into a similarity analysis. Based on the results, we created two semantic classes of objects, natural and man-made, that we used in a virtual reality experiment. Participants were asked to perform memory-guided reaching movements toward the initial position of a target object in a scene while either semantically congruent or incongruent landmarks were shifted. We found that the reaching endpoints systematically deviated in the direction of landmark shift. Importantly, this effect was stronger for shifts of semantically congruent landmarks. Our findings suggest that object-semantics facilitate allocentric coding by creating stable spatial configurations.


Subjects
Semantics; Space Perception/physiology; Adolescent; Adult; Humans; Memory; Nontherapeutic Human Experimentation; Psychomotor Performance; Virtues; Young Adult
11.
J Vis ; 17(12): 7, 2017 10 01.
Article in English | MEDLINE | ID: mdl-29049594

ABSTRACT

We measured perceptual judgments of category, material attributes, affordances, and similarity to investigate the perceptual dimensions underlying the visual representation of a broad class of natural dynamic flows (sea waves, smoke, and windblown foliage). The dynamic flows were looped 3-s movies windowed with circular apertures of two sizes to manipulate the level of spatial context. In low levels of spatial context (smaller apertures), human observers' judgments of material attributes and affordances were inaccurate, with estimates biased toward assumptions that the flows resulted from objects that were rigid, "pick-up-able," and not penetrable. The similarity arrangements showed dynamic flow clusters based partly on material, but dominated by color appearance. In high levels of spatial context (large apertures), observers reliably estimated material categories and their attributes. The similarity arrangements were based primarily on categories related to external, physical causes. Representational similarity analysis suggests that while shallow dimensions like color sometimes account for inferences of physical causes in the low-context condition, shallow dimensions cannot fully account for these inferences in the high-context condition. For the current broad data set of dynamic flows, the perceptual dimensions that best account for the similarity arrangements in the high-context condition are related to the intermolecular bond strength of a material's underlying physical structure. These arrangements are also best related to affordances that underlie common motor activities. Thus, the visual system appears to use an efficient strategy to resolve flow ambiguity; vision will sometimes rely on local, image-based, statistical properties that can support reliable inference of external physical causes, and other times it uses deeper causal knowledge to interpret and use flow information to the extent that it is useful for everyday action decisions.


Subjects
Attention/physiology; Form Perception/physiology; Judgment/physiology; Motion Perception/physiology; Pattern Recognition, Visual/physiology; Humans; Photic Stimulation
12.
Front Syst Neurosci ; 9: 156, 2015.
Article in English | MEDLINE | ID: mdl-26635546

ABSTRACT

A central puzzle in vision science is how perceptions that are routinely at odds with physical measurements of real world properties can arise from neural responses that nonetheless lead to effective behaviors. Here we argue that the solution depends on: (1) rejecting the assumption that the goal of vision is to recover, however imperfectly, properties of the world; and (2) replacing it with a paradigm in which perceptions reflect biological utility based on past experience rather than objective features of the environment. Present evidence is consistent with the conclusion that conceiving vision in wholly empirical terms provides a plausible way to understand what we see and why.

13.
Front Psychol ; 6: 1072, 2015.
Article in English | MEDLINE | ID: mdl-26283998

ABSTRACT

Based on electrophysiological and anatomical studies, a prevalent conception is that the visual system recovers features of the world from retinal images to generate perceptions and guide behavior. This paradigm, however, is unable to explain why visual perceptions differ from physical measurements, or how behavior could routinely succeed on this basis. An alternative is that vision does not recover features of the world, but assigns perceptual qualities empirically by associating frequently occurring stimulus patterns with useful responses on the basis of survival and reproductive success. The purpose of the present article is to briefly describe this strategy of vision and the evidence for it.

14.
Front Comput Neurosci ; 8: 134, 2014.
Article in English | MEDLINE | ID: mdl-25404912

ABSTRACT

The responses of visual neurons in experimental animals have been extensively characterized. To ask whether these responses are consistent with a wholly empirical concept of visual perception, we optimized simple neural networks that responded according to the cumulative frequency of occurrence of local luminance patterns in retinal images. Based on this estimation of accumulated experience, the neuron responses showed classical center-surround receptive fields, luminance gain control and contrast gain control, the key properties of early level visual neurons determined in animal experiments. These results imply that a major purpose of pre-cortical neuronal circuitry is to contend with the inherently uncertain significance of luminance values in natural stimuli.
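The core idea in this abstract, responding according to the accumulated frequency of stimulus values rather than the values themselves, can be sketched with a one-neuron toy model; the function and the synthetic "experience" distribution are hypothetical stand-ins, not the paper's optimized networks:

```python
import numpy as np

def empirical_response(center_values, probe):
    """Wholly empirical toy response: the 'neuron' responds with the
    percentile rank of the probe luminance within its accumulated
    experience of luminances, rather than with the raw luminance."""
    center_values = np.sort(np.asarray(center_values))
    return np.searchsorted(center_values, probe, side="right") / len(center_values)

rng = np.random.default_rng(0)
experience = rng.exponential(scale=1.0, size=10_000)  # skewed, natural-ish luminances
r_median = empirical_response(experience, np.median(experience))
```

Because the response is a rank rather than a raw value, the same circuit automatically compresses frequent luminances and expands rare ones, which is the kind of gain-control behavior the abstract reports emerging from experience-based optimization.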

15.
J Vis ; 14(9)2014 Aug 19.
Article in English | MEDLINE | ID: mdl-25139864

ABSTRACT

All images are highly ambiguous, and to perceive 3-D scenes, the human visual system relies on assumptions about what lighting conditions are most probable. Here we show that human observers' assumptions about lighting diffuseness are well matched to the diffuseness of lighting in real-world scenes. We use a novel multidirectional photometer to measure lighting in hundreds of environments, and we find that the diffuseness of natural lighting falls in the same range as previous psychophysical estimates of the visual system's assumptions about diffuseness. We also find that natural lighting is typically directional enough to override human observers' assumption that light comes from above. Furthermore, we find that, although human performance on some tasks is worse in diffuse light, this can be largely accounted for by intrinsic task difficulty. These findings suggest that human vision is attuned to the diffuseness levels of natural lighting conditions.


Subjects
Light; Vision, Ocular/physiology; Visual Perception/physiology; Humans; Photic Stimulation; Photometry; Psychophysics
16.
Proc Natl Acad Sci U S A ; 111 Suppl 3: 10868-72, 2014 Jul 22.
Article in English | MEDLINE | ID: mdl-25024184

ABSTRACT

Understanding why spectra that are physically the same appear different in different contexts (color contrast), whereas spectra that are physically different appear similar (color constancy) presents a major challenge in vision research. Here, we show that the responses of biologically inspired neural networks evolved on the basis of accumulated experience with spectral stimuli automatically generate contrast and constancy. The results imply that these phenomena are signatures of a strategy that biological vision uses to circumvent the inverse optics problem as it pertains to light spectra, and that double-opponent neurons in early-level vision evolve to serve this purpose. This strategy provides a way of understanding the peculiar relationship between the objective world and subjective color experience, as well as rationalizing the relevant visual circuitry without invoking feature detection or image representation.


Subjects
Biological Evolution; Light; Nerve Net/physiology; Visual Perception/physiology; Color; Cornea/physiology; Humans; Models, Neurological; Neurons/physiology; Photic Stimulation; Retina/physiology; Synapses/physiology
17.
J Neurosci ; 32(11): 3679-96, 2012 Mar 14.
Article in English | MEDLINE | ID: mdl-22423090

ABSTRACT

A central goal of visual neuroscience is to relate the selectivity of individual neurons to perceptual judgments, such as detection of a visual pattern at low contrast or in noise. Since neurons in early areas of visual cortex carry information only about a local patch of the image, detection of global patterns must entail spatial pooling over many such neurons. Physiological methods provide access to local detection mechanisms at the single-neuron level but do not reveal how neural responses are combined to determine the perceptual decision. Behavioral methods provide access to perceptual judgments of a global stimulus but typically do not reveal the selectivity of the individual neurons underlying detection. Here we show how the existence of a nonlinearity in spatial pooling does allow properties of these early mechanisms to be estimated from behavioral responses to global stimuli. As an example, we consider detection of large-field sinusoidal gratings in noise. Based on human behavioral data, we estimate the length and width tuning of the local detection mechanisms and show that it is roughly consistent with the tuning of individual neurons in primary visual cortex of primate. We also show that a local energy model of pooling based on these estimated receptive fields is much more predictive of human judgments than competing models, such as probability summation. In addition to revealing underlying properties of early detection and spatial integration mechanisms in human cortex, our findings open a window on new methods for relating system-level perceptual judgments to neuron-level processing.
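The role of the pooling nonlinearity can be seen in a minimal sketch: with an energy (squared-response) rule, two stimuli that drive the local mechanisms with the same total response but different spatial distributions pool to different decision variables, whereas linear pooling could not tell them apart. This illustrates the general principle only, not the paper's fitted receptive-field model:

```python
import numpy as np

def energy_pooling(responses, exponent=2.0):
    """Nonlinear spatial pooling: local mechanism responses are raised to
    a power (exponent 2 gives an energy model) and summed. Because the
    pooled value depends on how signal is distributed over space, global
    behavioral thresholds can constrain the local mechanisms."""
    return np.sum(np.abs(responses) ** exponent)

# Same total local response, different spatial distributions:
concentrated = np.array([4.0, 0.0, 0.0, 0.0])
spread = np.array([1.0, 1.0, 1.0, 1.0])
```

Here `energy_pooling(concentrated)` exceeds `energy_pooling(spread)` even though the linear sums are identical, which is why a nonlinearity in pooling lets behavioral detection data reveal properties of the early, local mechanisms.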


Subjects
Neurons/physiology; Pattern Recognition, Visual/physiology; Photic Stimulation/methods; Space Perception/physiology; Visual Cortex/physiology; Female; Humans; Male; Nonlinear Dynamics; Visual Pathways/physiology
18.
Proc Natl Acad Sci U S A ; 108(30): 12551-3, 2011 Jul 26.
Article in English | MEDLINE | ID: mdl-21746935

ABSTRACT

Every biological or artificial visual system faces the problem that images are highly ambiguous, in the sense that every image depicts an infinite number of possible 3D arrangements of shapes, surface colors, and light sources. When estimating 3D shape from shading, the human visual system partly resolves this ambiguity by relying on the light-from-above prior, an assumption that light comes from overhead. However, light comes from overhead only on average, and most images contain visual information that contradicts the light-from-above prior, such as shadows indicating oblique lighting. How does the human visual system perceive 3D shape when there are contradictions between what it assumes and what it sees? Here we show that the visual system combines the light-from-above prior with visual lighting cues using an efficient statistical strategy that assigns a weight to the prior and to the cues and finds a maximum-likelihood lighting direction estimate that is a compromise between the two. The prior receives surprisingly little weight and can be overridden by lighting cues that are barely perceptible. Thus, the light-from-above prior plays a much more limited role in shape perception than previously thought, and instead human vision relies heavily on lighting cues to recover 3D shape. These findings also support the notion that the visual system efficiently integrates priors with cues to solve the difficult problem of recovering 3D shape from 2D images.
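A minimal sketch of the weighted compromise this abstract describes, with hypothetical weights: because lighting direction is circular, prior and cue are combined as weighted unit vectors rather than as raw angles. This is an illustration of the strategy, not the authors' fitted model:

```python
import numpy as np

def combine_lighting(prior_deg, cue_deg, w_prior, w_cue):
    """Reliability-weighted compromise between the light-from-above prior
    and an observed lighting cue. Directions are combined as weighted
    unit vectors because lighting direction is a circular quantity."""
    a, b = np.deg2rad([prior_deg, cue_deg])
    v = (w_prior * np.array([np.cos(a), np.sin(a)])
         + w_cue * np.array([np.cos(b), np.sin(b)]))
    return np.rad2deg(np.arctan2(v[1], v[0]))

# Prior says light comes from above (90 deg) but, as the abstract reports,
# receives little weight, so an oblique shading cue dominates the estimate.
est = combine_lighting(prior_deg=90, cue_deg=30, w_prior=0.1, w_cue=0.9)
```

With these weights the combined estimate lands close to the cue direction and far from the prior, reproducing the paper's qualitative finding that barely perceptible lighting cues can override the light-from-above assumption.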


Subjects
Visual Perception/physiology; Bayes Theorem; Form Perception/physiology; Humans; Light; Male; Models, Psychological; Photic Stimulation; Psychophysics; Young Adult
19.
J Vis ; 10(11): 15, 2010 Sep 01.
Article in English | MEDLINE | ID: mdl-20884510

ABSTRACT

Bayesian cue combination models have been used to examine how human observers combine information from several cues to form estimates of linear quantities like depth. Here we develop an analogous theory for circular quantities like planar direction. The circular theory is broadly similar to the linear theory but differs in significant ways. First, in the circular theory the combined estimate is a nonlinear function of the individual cue estimates. Second, in the circular theory the mean of the combined estimate is affected not only by the means of individual cues and the weights assigned to individual cues but also by the variability of individual cues. Third, in the circular theory the combined estimate can be less certain than the individual estimates, if the individual estimates disagree with one another. Fourth, the circular theory does not have some of the closed-form expressions available in the linear theory, so data analysis requires numerical methods. We describe a vector sum model that gives a heuristic approximation to the circular theory's behavior. We also show how the theory can be extended to deal with spherical quantities like direction in three-dimensional space.
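The vector sum heuristic this abstract mentions can be sketched as follows (a toy illustration, not the paper's full numerical treatment): each cue contributes a weighted unit vector; the resultant's angle is the combined estimate, and its length indexes the combined certainty, which shrinks when cues disagree:

```python
import numpy as np

def vector_sum_combine(angles_deg, weights):
    """Vector-sum heuristic for circular cue combination: each cue is a
    weighted unit vector; the resultant's angle is the combined estimate
    and its length indexes combined certainty (shorter = less certain)."""
    a = np.deg2rad(angles_deg)
    v = np.sum(np.asarray(weights)[:, None]
               * np.stack([np.cos(a), np.sin(a)], axis=1), axis=0)
    return np.rad2deg(np.arctan2(v[1], v[0])), np.linalg.norm(v)

# Agreeing cues yield a long resultant; disagreeing cues a short one.
_, certainty_agree = vector_sum_combine([10, 20], [0.5, 0.5])
_, certainty_disagree = vector_sum_combine([10, 150], [0.5, 0.5])
```

This reproduces the theory's third property above: when individual cue estimates disagree, the combined estimate can be less certain than either cue alone, something that cannot happen in the linear theory.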


Subjects
Cues; Form Perception/physiology; Pattern Recognition, Visual/physiology; Space Perception/physiology; Computer Simulation; Humans